Validating a set of Japanese EFL proficiency tests: demonstrating locally designed tests meet international standards

Author: Dunlea Jamie
Publication venue: University of Bedfordshire
Publication date: 01/12/2015

A thesis submitted to the University of Bedfordshire in fulfillment of the requirements for the degree of Doctor of Philosophy.

This study applied the latest developments in language testing validation theory to derive a core body of evidence that can contribute to the validation of a large-scale, high-stakes English as a Foreign Language (EFL) testing program in Japan. The testing program consists of a set of seven level-specific tests targeting different levels of proficiency. This core aspect of the program was selected as the main focus of this study. The socio-cognitive model of language test development and validation provided a coherent framework for the collection, analysis and interpretation of evidence. Three research questions targeted core elements of a validity argument identified in the literature on the socio-cognitive model. RQ 1 investigated the criterial contextual and cognitive features of tasks at different levels of proficiency. Expert judgment and automated analysis tools were used to analyze a large bank of items administered in operational tests across multiple years. RQ 2 addressed empirical item difficulty across the seven levels of proficiency. An innovative approach to vertical scaling was used to place previously administered items from all levels onto a single Rasch-based difficulty scale. RQ 3 used multiple standard-setting methods to investigate whether the seven levels could be meaningfully related to an external proficiency framework. In addition, the study identified three subsidiary goals: firstly, to evaluate the efficacy of applying international standards of best practice to a local context; secondly, to critically evaluate the model of validation; and thirdly, to generate insights directly applicable to operational quality assurance. The study provides evidence across all three research questions to support the claim that the seven levels in the program are distinct.
At the same time, the results provide insights into how to strengthen explicit task specification to improve consistency across levels. This study is the largest application of the socio-cognitive model in terms of the amount of operational data analyzed, and thus makes a significant contribution to the ongoing study of validity theory in the context of language testing. While the study demonstrates the efficacy of the socio-cognitive model selected to drive the research design, it also provides recommendations for further refining the model, with implications for the theory and practice of language testing validation.

University of Bedfordshire Repository